Counter-Factual Reinforcement Learning: How to Model Decision-Makers That Anticipate the Future

Authors

  • Ritchie Lee (Carnegie Mellon University Silicon Valley, NASA Ames Research Park, Mail Stop 23-11, Moffett Field, CA 94035)
  • David H. Wolpert (Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501; Los Alamos National Laboratory, MS B256, Los Alamos, NM 87545)
  • James W. Bono (American University, 4400 Massachusetts Ave. NW, Washington, DC 20016)
  • Scott Backhaus (Los Alamos National Laboratory, MS K764, Los Alamos, NM 87545)
  • Russell Bent
  • Brendan Tracey

Abstract

This chapter introduces a novel framework for modeling interacting humans in a multi-stage game. This “iterated semi network-form game” framework has the following desirable characteristics: (1) bounded rational players, (2) strategic players (i.e., players account for one another’s reward functions when predicting one another’s behavior), and (3) computational tractability even on real-world systems. We achieve these benefits by combining concepts from game theory and reinforcement learning. To be precise, we extend the bounded rational “level-K reasoning” model to apply to games over multiple stages. Our extension allows the decomposition of the overall modeling problem into a series of smaller ones, each ...
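To make the "level-K reasoning" idea in the abstract concrete, here is a minimal sketch for a single-stage, two-player matrix game. It is not the chapter's iterated semi network-form game and contains no reinforcement learning; it only illustrates the recursion in which a level-k player best-responds to a level-(k-1) opponent. The payoff matrices, the uniform level-0 policy, and all function names are assumptions chosen for illustration.

```python
from typing import List

Payoff = List[List[float]]  # payoff[own_action][opponent_action]
Policy = List[float]        # probability of each action


def expected_payoffs(payoff: Payoff, opponent_policy: Policy) -> List[float]:
    """Expected payoff of each own action against a mixed opponent policy."""
    return [sum(p * u for p, u in zip(opponent_policy, row)) for row in payoff]


def best_response(payoff: Payoff, opponent_policy: Policy) -> Policy:
    """Deterministic best response (ties broken toward the lowest index)."""
    values = expected_payoffs(payoff, opponent_policy)
    best = max(range(len(values)), key=values.__getitem__)
    return [1.0 if a == best else 0.0 for a in range(len(values))]


def level_k_policy(k: int, payoff_self: Payoff, payoff_other: Payoff) -> Policy:
    """Level-0 plays uniformly at random (an assumption of this sketch);
    level-k best-responds to the opponent's level-(k-1) policy."""
    if k == 0:
        n = len(payoff_self)
        return [1.0 / n] * n
    # Swap roles to model the opponent one level below.
    opponent = level_k_policy(k - 1, payoff_other, payoff_self)
    return best_response(payoff_self, opponent)


# Hypothetical payoffs, each indexed from the owning player's perspective.
payoff_row: Payoff = [[3.0, 0.0], [2.0, 2.0]]
payoff_col: Payoff = [[2.0, 0.0], [0.0, 1.0]]

for k in range(3):
    print(k, level_k_policy(k, payoff_row, payoff_col))
```

In this toy game the row player's choice changes with reasoning depth: the level-1 player picks the safe second action against a random opponent, while the level-2 player switches to the first action because it predicts what a level-1 opponent will do. The recursion terminates at level 0, which is what keeps the model bounded rational rather than requiring a full equilibrium computation.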

Similar resources

Outsourcing or Insourcing of Transportation System Evaluation Using Intelligent Agents Approach

Nowadays, outsourcing is viewed as a trade strategy, and organizations tend to adopt new strategies to achieve competitive advantages in the current world of business. Focusing on main competencies and transferring most activities to resources outside the organization (outsourcing) is one such strategy. In this paper, we aim to decide on decision maker agent of transportation system, by a...

Confirmation bias in human reinforcement learning: Evidence from counterfactual feedback processing

Previous studies suggest that factual learning, that is, learning from obtained outcomes, is biased, such that participants preferentially take into account positive, as compared to negative, prediction errors. However, whether or not the prediction error valence also affects counterfactual learning, that is, learning from forgone outcomes, is unknown. To address this question, we analysed the ...

Comprehension of factual, nonfactual, and counterfactual conditionals by Iranian EFL learners

A considerable number of studies have been conducted on conditional reasoning supporting the mental model theory of propositional reasoning. Mental model theory, proposed by Johnson-Laird and Byrne, is an explanation of someone's thought process about how something occurs in the real world. Conditional reasoning, as a kind of reasoning, is the way to speak about possibilities or probabilities. The a...

Hierarchical Decision Making

Decision making must be made within an appropriate context; we contend that such context is best represented by a hierarchy of states. The lowest levels of this hierarchy represent the observed raw data, or specific low-level behaviors and decisions. As we ascend the hierarchy, the states become increasingly abstract, representing higher order tactics, strategies, and over-arching mission goals...

Reinforcement Learning for Problems with Hidden State

In this paper, we describe how techniques from reinforcement learning might be used to approach the problem of acting under uncertainty. We start by introducing the theory of partially observable Markov decision processes (POMDPs) to describe what we call hidden state problems. After a brief review of other POMDP solution techniques, we motivate reinforcement learning by considering an agent wi...

Publication date: 2013